1. INTRODUCTION

Fabric defect detection is a key process in quality control that identifies and visualizes the
appearance of fabric defects [1]-[3]. One of the most direct uses of artificial intelligence in industry is
machine learning for automatic fabric defect detection, whose successful deployment could lead to
improved fabric quality and lower labor costs. In engineering settings, real-time systems are commonly
used for both measurement and quality control tasks. In manufacturing engineering, such systems are
generally employed for automatic and cycle-based inspection of components and for supervision of the
production flow.

The human visual system is complex and enables a person to distinguish defects at both large and small
scales. The objective of the present study is to provide an automatic detection system that distinguishes
between defective and non-defective zones in fabric images. The system is implemented through machine
vision and is expected to detect defects with a low error rate. The aim of this work is the automatic
detection of defects, not the classification of faults. Machine inspection of fabric relies on computer
processing and is entirely image-based: images of the fabric are captured during manufacturing and
processed to identify irregularities [4]. Defect detection in fabrics remains a significant challenge for
industry and researchers, because the objective is to identify diverse anomalous structures in
complicated contexts.

Multiple approaches to detecting defects in various settings have been proposed [5]-[10]. For example,
Abouelela et al. [8] assumed that fabric texture is composed of a basic structure and considered any
area with a modified structure to be a defect. Because the frequency spectra of defective and
non-defective textures differ significantly, Chan et al. [9] used a Fourier transform method to
distinguish between these areas.

This spectral method, however, is not appropriate for complicated textures. Deep learning has
contributed immensely to solving many computer vision challenges in recent years. Several
approaches [11]-[17] have used deep neural networks to detect fabric defects. Using a trained
architecture, Zhao et al. [16] built a convolutional neural network (CNN) based on integrated visual
short- and long-term memory to discriminate between fabric fault images. The authors observed that deep
learning methods conceived for various image classification tasks can be fitted to the fabric defect
classification challenge, while also indicating the need for an adequately designed architecture. Li et al.
[18] tested a Fisher criterion-based stacked denoising auto-encoder (FCSDA) that uses equal-size fabric
image patches to categorize test patches as defective or non-defective, with the residual between the
rebuilt images and the defective patches giving the location of the defect. It showed good
performance on regular and complex woven knitted jacquard patterns. Although such approaches have been
successful in certain applications, most of them are restricted to simple textures and cannot resolve
the complicated real-world problems of fabric inspection.

To increase the efficiency of real-world fabric defect detection, several issues must be
addressed. First, labeling the various faults in real products takes time and labor. Because of the
wide variety of fabric defects and fabric types, it is hard to collect a meaningful and detailed dataset
covering all possible fabric textures. As a result, a pre-trained architecture generally does not perform
well on fabrics with previously unseen textures. In addition, as the production process and materials
vary for each kind of fabric, the appearance and features of each defect vary widely, which also makes
the detection of fabric defects challenging.

This paper evaluates the ability of GoogLeNet, a CNN, to identify fabric defects in the
textile texture database (TILDA) dataset, which covers seven classes of fabric defects, with the aim of
developing a reliable system that is able to detect defects in real time. The paper is structured as
follows. First, a summary of the literature on fabric defect detection approaches is presented, followed
by the proposed method for automatic fabric defect detection, comprising the preparation of the dataset
and the description of the proposed network model. Finally, the results and discussion are presented,
followed by the conclusion.


2. METHOD 

We used the GoogLeNet CNN pretrained on the ImageNet database. The weights learned from ImageNet
are transferred to the network and adjusted to detect the presence of defects in the fabric images. Both
simple and complex texture fabrics are used to fine-tune the network and obtain a reliable system that
can detect defects in real time.


2.1. GoogLeNet architecture 

The architecture of GoogLeNet [19], proposed by Szegedy et al. in 2015, differs from other classical
CNNs. It has 22 layers, and the number of units at each stage is increased by the inception module [20],
which applies parallel filters of sizes 1x1, 3x3 and 5x5. Figure 1 shows the 22 layers of
GoogLeNet.

The model is designed to be accurate at a low computational cost, for use in mobile and
embedded systems. To make the architecture computationally efficient, the inception module with reduced
dimensionality is used instead of the naive version. Rectified linear units (ReLU) are used as activation
functions for all the convolutional layers of this architecture. Figure 2 shows the inception module.
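
To illustrate why the reduced-dimensionality module is cheaper than the naive one, the following minimal sketch counts convolution weights for both variants. The channel sizes are an assumption taken from the first inception block reported in the GoogLeNet paper [19] (192 input channels), not from this study, and biases are ignored for simplicity.

```python
def conv_weights(in_ch, out_ch, k):
    """Number of weights in a k x k convolution (biases ignored)."""
    return in_ch * out_ch * k * k


def naive_inception(in_ch, n1, n3, n5):
    """Naive module: 1x1, 3x3 and 5x5 filters applied directly to the input."""
    weights = (conv_weights(in_ch, n1, 1)
               + conv_weights(in_ch, n3, 3)
               + conv_weights(in_ch, n5, 5))
    # The pooling branch has no weights and passes every input channel through.
    out_ch = n1 + n3 + n5 + in_ch
    return weights, out_ch


def reduced_inception(in_ch, n1, r3, n3, r5, n5, pool_proj):
    """Module with 1x1 reductions before the 3x3 and 5x5 filters and after pooling."""
    weights = (conv_weights(in_ch, n1, 1)
               + conv_weights(in_ch, r3, 1) + conv_weights(r3, n3, 3)
               + conv_weights(in_ch, r5, 1) + conv_weights(r5, n5, 5)
               + conv_weights(in_ch, pool_proj, 1))
    out_ch = n1 + n3 + n5 + pool_proj
    return weights, out_ch


# Assumed channel sizes (first inception block in [19]); illustrative only.
print(naive_inception(192, 64, 128, 32))               # (387072, 416)
print(reduced_inception(192, 64, 96, 128, 16, 32, 32)) # (163328, 256)
```

Under these assumed sizes, the reduced module uses fewer than half the weights of the naive one and also outputs fewer channels, which keeps the cost of the following layers down.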

One can choose to fine-tune all the layers of the architecture, or to keep the first layers
frozen (to limit over-fitting) and refine only part of the top-level architecture. The motivation is
the finding that the early layers of a network learn the most generic features (e.g., edge or
color detectors), which are expected to be relevant to many tasks, while the later layers learn
features that are more specific to the classes contained in the fabric dataset. The
GoogLeNet network was first trained on the ImageNet dataset, which consists of around one million
images in one thousand classes. Our labeled fabric dataset contains just 24,000 fabric
images in 2 categories. The fabric dataset is therefore not large enough to train GoogLeNet from
scratch, so we reuse the weights learned by the ImageNet-trained GoogLeNet network. We fine-tuned all
layers except for the top two pretrained layers, which contain the most general-purpose values and are
independent of the data. The existing classification layer "loss3/classifier" produces predictions for
1,000 classes; it is replaced by a new binary classification layer.

Figure 1. Architecture of GoogLeNet [19]

Figure 2. The inception module [19]


We adapted the GoogLeNet architecture to our task by replacing the last fully connected
layer (designed for 1,000 categories) with a binary fully connected layer. The initial filter weights
learned from ImageNet are then updated through back-propagation to represent the fabric conditions
in the dataset more accurately. The following basic modifications were made:
a) The names of the three output layers were modified to avoid conflicts when reading the original
weights from the pre-trained model. Thus:
"loss1/classifier" became "loss1/classifier_defect";
"loss2/classifier" became "loss2/classifier_defect";
"loss3/classifier" became "loss3/classifier_defect".
b) The output layer count was reduced to two (from 1,000) to take into account the two categories:
defective and non-defective.
c) The base learning rate base_lr was set to 0.01 and the learning rate policy is polynomial.
d) The max_iter, the maximum number of iterations, was set to 10,000.
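
As an illustration of settings c) and d), the sketch below evaluates a polynomial learning-rate schedule of the form used by Caffe's "poly" policy, base_lr * (1 - iter / max_iter) ^ power. The exponent is not reported in this paper, so power = 1.0 (linear decay) is an assumption made only for the example.

```python
def poly_learning_rate(iteration, base_lr=0.01, max_iter=10_000, power=1.0):
    """Polynomial decay: base_lr * (1 - iteration / max_iter) ** power.

    base_lr and max_iter follow the values given in the text;
    power is an assumed value, not one reported by the authors.
    """
    if not 0 <= iteration <= max_iter:
        raise ValueError("iteration must lie in [0, max_iter]")
    return base_lr * (1.0 - iteration / max_iter) ** power


# Learning rate at the start, the midpoint and the end of training.
for it in (0, 5_000, 10_000):
    print(it, poly_learning_rate(it))
```

With this schedule the learning rate starts at base_lr and reaches zero exactly at max_iter, so the last iterations make only small weight updates.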

In this study, the Caffe framework [21] is used as the CNN library. A pre-trained version of the
GoogLeNet CNN is available for unrestricted use in [22]. GoogLeNet is also available in the DIGITS
training system version 5.0 (Nvidia Corporation, Santa Clara, CA) [23].




2.2. Data sets preparation 

Experiments were conducted on the popular TILDA database, a collection of fabric patterns that was
created in the context of the workshop "Texture Analysis" of the major research project of the Deutsche
Forschungsgemeinschaft "Automatic Visual Inspection of Technical Objects". In this workgroup, methods
for recognizing and distinguishing textures of different kinds were investigated and evaluated [24]. This
database consists of eight representative textile kinds, with seven error classes and one error-free
class. Thus, there are eight classes for each type of textile. The textile types fall into four main
groups (C1-C4), each composed of two different subgroups, as shown in Figures 3 and 4. Each fabric type
therefore has its own directory, which is split into eight sub-directories containing 50 texture images
each. The first subfolder, labeled "e0", holds non-defective images, while the remaining subfolders
("e1"-"e7") hold defective images. Figure 5 shows examples of common fabric defects: (a) plain fabric
without defects, (b) plain fabric with defects, (c) plain weave fabric without defects, (d) plain weave
fabric with defects, (e) twill fabric without defects, and (f) twill fabric with defects.




Figure 3. The four classes of the TILDA database {C1, C2, C3, and C4}


Figure 4. Examples of defective fabric images from TILDA database 


Figure 5. Examples of fabric images from the TILDA database: (a) plain fabric without defects,
(b) plain fabric with defects, (c) plain weave fabric without defects, (d) plain weave fabric with defects,
(e) twill fabric without defects, and (f) twill fabric with defects


Fifty different images for each of the selected classes (768x512 pixels, 8-bit grayscale) were
obtained by moving and rotating the fabric image. The whole database of textured textiles is composed of
3,200 images with a total size of 1.2 GB. The images were resized from 768x512 to
224x224. The images were randomly divided into 90% for training, 5% for validation and 5%
for testing. The training data set was composed of 60% images of non-defective fabric (negative class
'0') and 40% images of defective fabric (positive class '1').
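
The random 90/5/5 division described above can be sketched as follows. This is a minimal illustration, not the authors' script: the function name, the fixed seed and the use of integer indices in place of image files are all assumptions.

```python
import random


def split_dataset(items, train=0.90, val=0.05, seed=0):
    """Shuffle items and split them into training, validation and test subsets.

    The test subset receives whatever remains after the first two subsets.
    """
    items = list(items)
    rng = random.Random(seed)  # fixed seed (an assumption) keeps the split reproducible
    rng.shuffle(items)
    n_train = round(len(items) * train)
    n_val = round(len(items) * val)
    return (items[:n_train],
            items[n_train:n_train + n_val],
            items[n_train + n_val:])


# 3,200 placeholder indices stand in for the 3,200 TILDA images.
train_set, val_set, test_set = split_dataset(range(3200))
print(len(train_set), len(val_set), len(test_set))  # 2880 160 160
```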

The current study used a total of 3,200 images from the TILDA database. We enlarged
the dataset by applying three turning and rotation operations (including rotations of 90° and 270°) as
data augmentation techniques. The training images were then further augmented by adjusting the
sharpness, luminosity, and contrast of the images with the IrfanView picture editing software [25]. This
is a common way to train on small datasets more effectively. As a result, the size of the training
dataset increased from 3,200 to 24,000 patches.
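
The rotation part of this augmentation can be sketched in a few lines. The example below is only an illustration under simplifying assumptions: an image is represented as a list of pixel rows, the helper names are hypothetical, and the sharpness, luminosity and contrast adjustments (done by the authors in IrfanView) are not reproduced.

```python
def rotate90(image):
    """Rotate a 2-D image (list of rows) by 90 degrees clockwise."""
    return [list(row) for row in zip(*image[::-1])]


def rotate(image, degrees):
    """Rotate clockwise by a multiple of 90 degrees."""
    if degrees % 90 != 0:
        raise ValueError("only multiples of 90 degrees are supported")
    for _ in range((degrees // 90) % 4):
        image = rotate90(image)
    return image


def augment(image, angles=(90, 270)):
    """Return the original image followed by one rotated copy per angle."""
    return [image] + [rotate(image, angle) for angle in angles]


tiny = [[1, 2],
        [3, 4]]
for variant in augment(tiny):
    print(variant)
```

Each source image yields one extra training sample per rotation angle, which is how a small dataset can be multiplied without collecting new fabric images.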

The TILDA dataset includes four classes of varying textures: C1 and C2 contain non-motif-based fabric
images, and C3 and C4 contain motif-based fabric images. Thus, three sets of training data (S1, S2 and S3)
were constructed, with 8,000 images each. S1 contains the non-motif classes, S2 the motif-based
classes, and S3 a mix of images from S1 and S2.

Two configurations were designed to fine-tune the pre-trained GoogLeNet architecture: the first
updates the parameters of the final couple of layers, while the second updates the parameters of the
final six layers. Both configurations were trained on the three training sets. As a result, six models
were created and tested.
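
The sketch below shows one way the two configurations and the six resulting models could be enumerated. It rests on several assumptions: "couple" is read as two layers, the layer list is a hypothetical, shortened stand-in for the real GoogLeNet layers, and freezing is expressed as a per-layer learning-rate multiplier in the spirit of Caffe's lr_mult setting (0 freezes a layer).

```python
from itertools import product


def lr_multipliers(layer_names, n_trainable):
    """Give multiplier 1.0 to the last n_trainable layers and 0.0 (frozen) to the rest."""
    if not 0 <= n_trainable <= len(layer_names):
        raise ValueError("n_trainable must be between 0 and the number of layers")
    cutoff = len(layer_names) - n_trainable
    return {name: (0.0 if index < cutoff else 1.0)
            for index, name in enumerate(layer_names)}


# Hypothetical, shortened layer list in forward order (not the real layer names).
LAYERS = ["conv1", "conv2", "inception_3", "inception_4",
          "inception_5", "pool5", "dropout", "classifier_defect"]

# Number of trainable final layers per configuration; 2 for "couple" is an assumption.
CONFIGS = {"Conf 1": 2, "Conf 2": 6}
TRAINING_SETS = ["S1", "S2", "S3"]

# Two configurations times three training sets gives the six models.
models = list(product(CONFIGS, TRAINING_SETS))
for conf, training_set in models:
    trainable = [n for n, m in lr_multipliers(LAYERS, CONFIGS[conf]).items() if m > 0]
    print(conf, training_set, trainable)
```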


3. RESULTS AND DISCUSSION 

The overall classification accuracies of the two configurations on the three
training sets, for different numbers of iterations, are shown in Tables 1 to 3. Accuracy, sensitivity and
specificity are defined by the following formulas, in which TP denotes the true positives, FP the
false positives, TN the true negatives and FN the false negatives:


$$\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$$

$$\text{Sensitivity} = \frac{TP}{TP + FN}$$

$$\text{Specificity} = \frac{TN}{TN + FP}$$
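
The three measures can be computed directly from the confusion counts, as in the minimal sketch below. The counts in the example are made up for illustration and are not results from this study.

```python
def classification_metrics(tp, fp, tn, fn):
    """Accuracy, sensitivity and specificity from confusion-matrix counts."""
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("need at least one positive and one negative sample")
    total = tp + fp + tn + fn
    return {
        "accuracy": (tp + tn) / total,
        "sensitivity": tp / (tp + fn),   # share of defective images detected
        "specificity": tn / (tn + fp),   # share of defect-free images accepted
    }


# Hypothetical counts for a 160-image test set (defective = positive class).
example = classification_metrics(tp=60, fp=6, tn=90, fn=4)
print(example)
```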


Table 1. Classification accuracy of non-motif based texture images S1 over the two configurations
Iteration  1,000  2,000  3,000  4,000  5,000  6,000  7,000  8,000  9,000  10,000
Conf 1     0.954  0.956  0.94   0.96   0.959  0.957  0.966  0.967  0.965  0.969
Conf 2     0.952  0.946  0.981  0.95   0.968  0.969  0.986  0.975  0.977  0.976

Table 2. Classification accuracy of motif based texture images S2 over the two configurations
Iteration  1,000  2,000  3,000  4,000  5,000  6,000  7,000  8,000  9,000  10,000
Conf 1     0.965  0.980  0.964  0.988  0.986  0.984  0.990  0.991  0.988  0.990
Conf 2     0.963  0.957  0.960  0.964  0.982  0.983  0.992  0.993  0.990  0.992


Table 3. Classification accuracy of mixed texture images S3 over the two configurations
Iteration  1,000  2,000  3,000  4,000  5,000  6,000  7,000  8,000  9,000  10,000
Conf 1     0.868  0.852  0.840  0.860  0.859  0.857  0.866  0.867  0.865  0.887
Conf 2     0.874  0.884  0.864  0.880  0.886  0.884  0.868  0.891  0.890  0.897


We recorded a maximum accuracy of 97% for non-motif based texture images and 99% for
motif based texture images at 8,000 iterations. However, the accuracy recorded on mixed images was
about 86% at 7,000 iterations, which is relatively low. Figure 6 illustrates, in detail, the accuracy,
sensitivity and specificity obtained according to the iteration number.

In general, the six trained models scored well in terms of accuracy, as shown in Tables 1 to 3. It is
also noticeable that in most cases, and especially for high numbers of iterations, the second
configuration showed a better accuracy than the first one: 97% for the S1 group, 99% for the S2 group,
and 90% for the S3 group. This second configuration updates the parameters of the last six layers of the
pre-trained GoogLeNet.


Figure 6. Classification accuracy of mixed images for the second fine-tuning configuration


Convolutional networks provide robust, error-tolerant, parallel computing and self-learning
abilities for dealing with complicated environmental data. Since the beginning of the 21st century, with
the fast progress of big data and AI, the use of convolutional networks to detect, segment [6], [26],
recognize [17], analyze and process data has been very successful, especially in applications with a
large number of labeled images, for example surface finish [27], [28], industrial images [29], health
images [30]-[34] and weed detection [35], [36]. In our previous work [11], three well-known pre-trained
CNN models were compared for detecting defects in fabric texture, and all three achieved an accuracy
above 96%.

4. CONCLUSION

In this paper, the well-known GoogLeNet network was fine-tuned to detect the presence of defects in
motif and non-motif texture images. The aim of this investigation was to develop an automatic fabric
inspection system able to detect fabric defects in motif- and non-motif-based fabric images. Experimental
results show an accuracy of 97% using a specific classifier for each set and 89% for a common classifier.
In future research, we intend to classify the fabric faults into multiple classes, which will help to
identify the source of defects and prevent them in the future. The study will also focus on different
techniques such as layer freezing, Dropout, Top-N as output for estimation, and the incorporation of
perspective information to increase the accuracy, and will benchmark the approach against various
architectures including visual geometry group network (VGGNet), residual network (ResNet) and MobileNet.